    Parallel Batch-Dynamic kd-trees

    kd-trees are widely used in parallel databases to support efficient neighborhood and similarity queries. Supporting parallel updates to kd-trees is therefore an important operation. In this paper, we present BDL-tree, a parallel, batch-dynamic implementation of a kd-tree that allows for efficient parallel k-NN queries over dynamically changing point sets. BDL-trees consist of a log-structured set of kd-trees which can be used to efficiently insert or delete batches of points in parallel with polylogarithmic depth. Specifically, given a BDL-tree with $n$ points, each batch of $k$ updates takes $O(k \log^2(n + k))$ amortized work and $O(\log(n + k) \log\log(n + k))$ depth (parallel time). We provide an optimized multicore implementation of BDL-trees. Our optimizations include parallel cache-oblivious kd-tree construction and parallel bloom filter construction. Our experiments on a 36-core machine with two-way hyper-threading using a variety of synthetic and real-world datasets show that our implementation of BDL-tree achieves a self-relative speedup of up to 34.8× (28.4× on average) for batch insertions, up to 35.5× (27.2× on average) for batch deletions, and up to 46.1× (40.0× on average) for k-nearest neighbor queries. In addition, it achieves throughputs of up to 14.5 million updates/second for batch-parallel updates and 6.7 million queries/second for k-NN queries. We compare to two baseline kd-tree implementations and demonstrate that BDL-trees achieve a good tradeoff between the two baseline options for implementing batch updates.
    M.Eng.
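The "log-structured set" of trees mentioned in the abstract can be illustrated with a minimal sequential sketch of the classic logarithmic method. This is not the BDL-tree implementation: the class name `LogStructuredPoints`, the `base` parameter, and the use of plain lists scanned by brute force in place of static kd-trees are all assumptions made for illustration, and deletions, bloom filters, and parallelism are omitted.

```python
import heapq
from math import dist


class LogStructuredPoints:
    """Hypothetical sketch: level i holds at most base * 2**i points.

    Each level stands in for one static structure (a kd-tree in the
    abstract; here just a list that is scanned by brute force).
    """

    def __init__(self, base=2):
        self.base = base
        self.levels = []  # levels[i] is a list of points, possibly empty

    def insert_batch(self, batch):
        # Carry the batch upward, absorbing occupied levels on the way,
        # until an empty level with enough capacity is found. The merged
        # points are then "rebuilt" into that single level.
        carry = list(batch)
        i = 0
        while carry:
            if i == len(self.levels):
                self.levels.append([])
            capacity = self.base << i
            if not self.levels[i] and len(carry) <= capacity:
                self.levels[i] = carry  # stand-in for a static rebuild
                carry = []
            else:
                carry.extend(self.levels[i])
                self.levels[i] = []
                i += 1

    def size(self):
        return sum(len(level) for level in self.levels)

    def knn(self, query, k):
        # Query every level independently, then merge the candidates.
        by_distance = lambda p: dist(p, query)
        candidates = []
        for level in self.levels:
            candidates.extend(heapq.nsmallest(k, level, key=by_distance))
        return heapq.nsmallest(k, candidates, key=by_distance)
```

In this sketch a point is rebuilt only when it moves to a higher level, which is the usual source of the logarithmic amortized overhead per inserted point; a k-NN query pays for visiting every non-empty level.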

    Tuplex: Data Science in Python at Native Code Speed

    ParGeo: A Library for Parallel Computational Geometry

    This paper presents ParGeo, a multicore library for computational geometry. ParGeo contains modules for fundamental tasks including kd-tree based spatial search, spatial graph generation, and algorithms in computational geometry. We focus on three new algorithmic contributions provided in the library. First, we present a new parallel convex hull algorithm based on a reservation technique to enable parallel modifications to the hull. We also provide the first parallel implementations of the randomized incremental convex hull algorithm as well as a divide-and-conquer convex hull algorithm in $\mathbb{R}^3$. Second, for the smallest enclosing ball problem, we propose a new sampling-based algorithm to quickly reduce the size of the data set. We also provide the first parallel implementation of Welzl's classic algorithm for smallest enclosing ball. Third, we present the BDL-tree, a parallel batch-dynamic kd-tree that allows for efficient parallel updates and k-NN queries over dynamically changing point sets. BDL-trees consist of a log-structured set of kd-trees which can be used to efficiently insert, delete, and query batches of points in parallel. On 36 cores with two-way hyper-threading, our fastest convex hull algorithm achieves up to 44.7x self-relative parallel speedup and up to 559x speedup against the best existing sequential implementation. Our smallest enclosing ball algorithm using our sampling-based algorithm achieves up to 27.1x self-relative parallel speedup and up to 178x speedup against the best existing sequential implementation. Our implementation of the BDL-tree achieves self-relative parallel speedup of up to 46.1x. Across all of the algorithms in ParGeo, we achieve self-relative parallel speedup of 8.1x to 46.61x.